RAW SEQUENCE LISTING DATE: 05/31/2002



ENTERED 



Input Set : A:\Xen-001.app 
Output Set: N:\CRF3\05312002\J082671.raw 

3 <110> APPLICANT: DAHIYAT, BASSIL
4 LI, MIN

6 <120> TITLE OF INVENTION: USE OF NUCLEIC ACID LIBRARIES TO CREATE TOXICOLOGICAL
7 PROFILES

9 <130> FILE REFERENCE: XEN/001

11 <140> CURRENT APPLICATION NUMBER: 10/082,671
12 <141> CURRENT FILING DATE: 2002-05-17

14 <150> PRIOR APPLICATION NUMBER: 60/270,781
15 <151> PRIOR FILING DATE: 2001-02-22

17 <160> NUMBER OF SEQ ID NOS: 58

19 <170> SOFTWARE: PatentIn Ver. 2.1

21 <210> SEQ ID NO: 1
22 <211> LENGTH: 9
23 <212> TYPE: PRT
24 <213> ORGANISM: Artificial Sequence

26 <220> FEATURE:
27 <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic
28 peptide

30 <220> FEATURE:
31 <221> NAME/KEY: MOD_RES
32 <222> LOCATION: (1)..(3)
33 <223> OTHER INFORMATION: Variable amino acid

35 <220> FEATURE:
36 <221> NAME/KEY: MOD_RES
37 <222> LOCATION: (6)
38 <223> OTHER INFORMATION: Variable amino acid

40 <220> FEATURE:
41 <221> NAME/KEY: MOD_RES
42 <222> LOCATION: (8)..(9)
43 <223> OTHER INFORMATION: Variable amino acid

45 <400> SEQUENCE: 1
46 Xaa Xaa Xaa Pro Pro Xaa Pro Xaa Xaa
47 1 5

50 <210> SEQ ID NO: 2
51 <211> LENGTH: 20
52 <212> TYPE: PRT
53 <213> ORGANISM: Artificial Sequence

55 <220> FEATURE:
56 <223> OTHER INFORMATION: Description of Artificial Sequence: Consensus
57 sequence for SH-3 domain binding protein

59 <220> FEATURE:
60 <221> NAME/KEY: MOD_RES
61 <222> LOCATION: (3)..(7)
<223> OTHER INFORMATION: Unknown amino acid
<220> FEATURE:
<221> NAME/KEY: MOD_RES
<222> LOCATION: (13)
<223> OTHER INFORMATION: Val, Ala, Gly, Leu, Pro or Arg
<220> FEATURE:
<221> NAME/KEY: MOD_RES
<222> LOCATION: (15)..(16)
<223> OTHER INFORMATION: Val, Ala, Gly, Leu, Pro or Arg
<400> SEQUENCE: 2
Met Gly Xaa Xaa Xaa Xaa Xaa Arg Pro Leu Pro Pro Xaa Pro Xaa Xaa
1 5 10 15

Gly Gly Pro Pro
20

<210> SEQ ID NO: 3
<211> LENGTH: 63
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic
oligonucleotide consensus sequence for SH-3 domain
binding protein
<220> FEATURE:
<221> NAME/KEY: modified_base
<222> LOCATION: (7)..(8)
<223> OTHER INFORMATION: a, c, t or g
<220> FEATURE:
<221> NAME/KEY: modified_base
<222> LOCATION: (10)..(11)
<223> OTHER INFORMATION: a, c, t or g
<220> FEATURE:
<221> NAME/KEY: modified_base
<222> LOCATION: (13)..(14)
<223> OTHER INFORMATION: a, c, t or g
<220> FEATURE:
<221> NAME/KEY: modified_base
<222> LOCATION: (16)..(17)
<223> OTHER INFORMATION: a, c, t or g
<220> FEATURE:
<221> NAME/KEY: modified_base
<222> LOCATION: (19)..(20)
<223> OTHER INFORMATION: a, c, t or g

at^c™^^ kagacctctg cctccasbkg ggsbksbkgg aggcccacct 60
taa

<210> SEQ ID NO: 4
<211> LENGTH: 4
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence

127 <220> FEATURE:
128 <223> OTHER INFORMATION: Description of Artificial Sequence: Linker
129 consensus sequence

131 <400> SEQUENCE: 4
132 Gly Gly Gly Ser
133 1

136 <210> SEQ ID NO: 5
137 <211> LENGTH: 69
138 <212> TYPE: PRT
139 <213> ORGANISM: Artificial Sequence

141 <220> FEATURE:
142 <223> OTHER INFORMATION: Description of Artificial Sequence: Minibody
143 presentation structure

145 <400> SEQUENCE: 5
146 Met Gly Arg Asn Ser Gln Ala Thr Ser Gly Phe Thr Phe Ser His Phe
147 1 5 10 15

149 Tyr Met Glu Trp Val Arg Gly Gly Glu Tyr Ile Ala Ala Ser Arg His
150 20 25 30

152 Lys His Asn Lys Tyr Thr Thr Glu Tyr Ser Ala Ser Val Lys Gly Arg
153 35 40 45

155 Tyr Ile Val Ser Arg Asp Thr Ser Gln Ser Ile Leu Tyr Leu Gln Lys
156 50 55 60

158 Lys Lys Gly Pro Pro
159 65

162 <210> SEQ ID NO: 6
163 <211> LENGTH: 82
164 <212> TYPE: DNA
165 <213> ORGANISM: Artificial Sequence

167 <220> FEATURE:
168 <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic
169 oligonucleotide

171 <400> SEQUENCE: 6
172 aggaacccct agtgatggag ttggccactc cctctctgcg cgctcgctcg ctcactgagg 60
173 ccgcccgggc aaagcccggg cg 82

176 <210> SEQ ID NO: 7
177 <211> LENGTH: 10
178 <212> TYPE: PRT
179 <213> ORGANISM: Artificial Sequence

181 <220> FEATURE:
182 <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic
183 linker

185 <400> SEQUENCE: 7
186 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
187 1 5 10

190 <210> SEQ ID NO: 8
191 <211> LENGTH: 1866
192 <212> TYPE: DNA
193 <213> ORGANISM: adeno-associated virus 2

195 <400> SEQUENCE: 8
196 atgccggggt tttacgagat tgtgattaag gtccccagcg accttgacga gcatctgccc 60
197 ggcatttctg acagctttgt gaactgggtg gccgagaagg aatgggagtt gccgccagat 120
198 tctgacatgg atctgaatct gattgagcag gcacccctga ccgtggccga gaagctgcag 180
199 cgcgactttc tgacggaatg gcgccgtgtg agtaaggccc cggaggccct tttctttgtg 240
200 caatttgaga agggagagag ctacttccac atgcacgtgc tcgtggaaac caccggggtg 300
201 aaatccatgg ttttgggacg tttcctgagt cagattcgcg aaaaactgat tcagagaatt 360
202 taccgcggga tcgagccgac tttgccaaac tggttcgcgg tcacaaagac cagaaatggc 420
203 gccggaggcg ggaacaaggt ggtggatgag tgctacatcc ccaattactt gctccccaaa 480

204 acccagcctg agctccagtg ggcgtggact aatatggaac agtatttaag cgcctgtttg 540 

205 aatctcacgg agcgtaaacg gttggtggcg cagcatctga cgcacgtgtc gcagacgcag 600 
206 gagcagaaca aagagaatca gaatcccaat tctgatgcgc cggtgatcag atcaaaaact 660
207 tcagccaggt acatggagct ggtcgggtgg ctcgtggaca aggggattac ctcggagaag 720
208 cagtggatcc aggaggacca ggcctcatac atctccttca atgcggcctc caactcgcgg 780
209 tcccaaatca aggctgcctt ggacaatgcg ggaaagatta tgagcctgac taaaaccgcc 840
210 cccgactacc tggtgggcca gcagcccgtg gaggacattt ccagcaatcg gatttataaa 900
211 attttggaac taaacgggta cgatccccaa tatgcggctt ccgtctttct gggatgggcc 960
212 acgaaaaagt tcggcaagag gaacaccatc tggctgtttg ggcctgcaac taccgggaag 1020
213 accaacatcg cggaggccat agcccacact gtgcccttct acgggtgcgt aaactggacc 1080
214 aatgagaact ttcccttcaa cgactgtgtc gacaagatgg tgatctggtg ggaggagggg 1140
215 aagatgaccg ccaaggtcgt ggagtcggcc aaagccattc tcggaggaag caaggtgcgc 1200
216 gtggaccaga aatgcaagtc ctcggcccag atagacccga ctcccgtgat cgtcacctcc 1260
217 aacaccaaca tgtgcgccgt gattgacggg aactcaacga ccttcgaaca ccagcagccg 1320
218 ttgcaagacc ggatgttcaa atttgaactc acccgccgtc tggatcatga ctttgggaag 1380
219 gtcaccaagc aggaagtcaa agactttttc cggtgggcaa aggatcacgt ggttgaggtg 1440
220 gagcatgaat tctacgtcaa aaagggtgga gccaagaaaa gacccgcccc cagtgacgca 1500
221 gatataagtg agcccaaacg ggtgcgcgag tcagttgcgc agccatcgac gtcagacgcg 1560
222 gaagcttcga tcaactacgc agacaggtac caaaacaaat gttctcgtca cgtgggcatg 1620
223 aatctgatgc tgtttccctg cagacaatgc gagagaatga atcagaattc aaatatctgc 1680
224 ttcactcacg gacagaaaga ctgtttagag tgctttcccg tgtcagaatc tcaacccgtt 1740
225 tctatcgtca aaaaggcgta tcagaaactg tgctacattc atcatatcat gggaaaggtg 1800
226 ccagacgctt gcactgcctg cgatctggtc aatgtggatt tggatgactg catctttgaa 1860
227 caataa

230 <210> SEQ ID NO: 9
231 <211> LENGTH: 621
232 <212> TYPE: PRT
233 <213> ORGANISM: adeno-associated virus 2

235 <400> SEQUENCE: 9
236 Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp
237 1 5 10 15

239 Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu
240 20 25 30

242 Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile
243 35 40 45

245 Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu
246 50 55 60

248 Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val
249 65 70 75 80

251 Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu
252 85 90 95

254 Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile
255 100 105 110

257 Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu
258 115 120 125

260 Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly
261 130 135 140

263 Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys
264 145 150 155 160

266 Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu
267 165 170 175

269 Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His
270 180 185 190

272 Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn
273 195 200 205

275 Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr
276 210 215 220

278 Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys
279 225 230 235 240

281 Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala
282 245 250 255

284 Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys
285 260 265 270

287 Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln
288 275 280 285

290 Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu
291 290 295 300

293 Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala
294 305 310 315 320

296 Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala
297 325 330 335

299 Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro
300 340 345 350

302 Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp
303 355 360 365

305 Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala
306 370 375 380

308 Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg
309 385 390 395 400

311 Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val
312 405 410 415

314 Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser
315 420 425 430

317 Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe
318 435 440 445

320 Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln
321 450 455 460

323 Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val
324 465 470 475 480

326 Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

RAW SEQUENCE LISTING ERROR SUMMARY DATE: 05/31/2002
PATENT APPLICATION: US/10/082,671 TIME: 12:15:58

Input Set : A:\Xen-001.app
Output Set: N:\CRF3\05312002\J082671.raw

Please Note:

Use of n and/or Xaa have been detected in the Sequence Listing. Please review the 
Sequence Listing to ensure that a corresponding explanation is presented in the <220> 
to <223> fields of each sequence which presents at least one n or Xaa. 

Seq#:1; Xaa Pos. 1,2,3,6,9
Seq#:2; Xaa Pos. 3,4,5,6,7,13,15,16
Seq#:3; N Pos. 7,8,10,11,13,14,16,17,19,20
Seq#:58; Xaa Pos. 3,4,5,6
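
The position lists above can be reproduced from the sequence text itself. The following is a minimal sketch, not the checker that generated this report: the function name `flag_positions` is hypothetical, and it assumes protein sequences are written as whitespace-separated three-letter codes and nucleotide sequences as lowercase letters with optional spacing and running counts. It returns the 1-based positions of Xaa (for PRT) or n (for DNA).

```python
import re


def flag_positions(sequence: str, seq_type: str) -> list[int]:
    """Return 1-based positions of Xaa (PRT) or n (DNA) in a listing sequence.

    Hypothetical helper for illustration only; seq_type follows the <212>
    TYPE field ("PRT" or "DNA").
    """
    if seq_type == "PRT":
        # Three-letter residue codes separated by whitespace.
        residues = sequence.split()
        return [i for i, res in enumerate(residues, start=1)
                if res.lower() == "xaa"]
    # Nucleotides: drop spaces and running position counts, keep letters.
    bases = re.sub(r"[^A-Za-z]", "", sequence)
    return [i for i, base in enumerate(bases, start=1) if base.lower() == "n"]


# SEQ ID NO: 2 as given in this listing.
seq2 = ("Met Gly Xaa Xaa Xaa Xaa Xaa Arg Pro Leu Pro Pro Xaa Pro Xaa Xaa "
        "Gly Gly Pro Pro")
print(flag_positions(seq2, "PRT"))
```

For SEQ ID NO: 2 the returned positions are 3, 4, 5, 6, 7, 13, 15 and 16, which is the list reported for Seq#:2. Each such position should be covered by a <220> to <223> feature block in the corresponding sequence record.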